Abstract

The paper concerns the asymptotic distribution of the mixture density estimator, proposed by Leipus et al. (2006), in the aggregation/disaggregation problem of random parameter AR(1) process. We prove that, under mild conditions on the (semiparametric) form of the mixture density, the estimator is asymptotically normal. The proof is based on the limit theory for the quadratic form in linear random variables developed by Bhansali et al. (2007). The moving average representation of the aggregated process is investigated. A small simulation study illustrates the result.
1 Introduction



Aggregated time series data appear in different fields of study, including applied problems in hydrology, sociology, statistics and economics. Considering aggregation as a time series object, a number of important questions arise. These comprise the properties of macro level data obtained by small and large-scale aggregation in time, space or both, the assumptions under which the inverse (disaggregation) problem can be solved and, finally, how to apply theoretical results in practice.

An aggregated time series can, in fact, be viewed as a transformation of the underlying time series by some specific (linear or non-linear) function defined on a finite or infinite set of individual processes. In this paper we consider a linear aggregation scheme, which is natural in applications. In practice it is found convenient to approximate individual data by simple time series models, such as AR(1) or GARCH(1,1) (see Lewbel (1994), Chong (2006), Zaffaroni (2004, 2006)), whereas more complex individual data models do not provide an advantage in accuracy and efficiency of estimates, and are usually very difficult to study from the theoretical point of view.

Aggregation by appropriately averaging the micro level time series mod- 
els can give intriguing results. It was shown in Granger (1980) that the 
large-scale aggregation of infinitely many short memory AR(1) models with 
random coefficients can lead to a long memory fractionally integrated pro- 
cess. It means that the properties of an aggregate time series may in general 
differ from those of individual data. 

It is clear, however, that the weakest point of aggregation is a considerable loss of information about individual characteristics of the underlying data. Roughly speaking, an aggregated time series cannot be as informative about the attributes of individual data as the micro level processes are. On the other hand, some special aggregation schemes, which involve, for instance, independent identically distributed "elementary" processes with known structure (such as AR(1)), make it possible to solve an inverse problem: to recover the properties of individual series with the aggregated data at hand. This problem is called a disaggregation problem.

Different aspects of this problem were investigated in Dacunha-Castelle and Oppenheim (2001), Leipus et al. (2006), Celov et al. (2007). The last two papers deal with the asymptotic statistical theory in the disaggregation problem: they present the construction of the mixture density estimate of the individual AR(1) models, the consistency of the estimate, and some theoretical tools needed here. Continuing the previous research, the major objective of the present paper is to obtain the asymptotic normality property of the mixture density estimate, which enlarges the range of applications: assessing the accuracy of simulation studies, statistical inference, forecasting and other problems.

Section 2 describes the disaggregation scheme, including the construction of the mixture density estimate proposed by Leipus et al. (2006), and formulates the main result of the paper. Important issues about the moving average representation of the aggregated process are discussed in Section 3. The proof of the main theorem and auxiliary results are given respectively in Section 4 and Section 7. Some simulation results are presented in Section 5.



2 Preliminaries and the main result 



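Consider a sequence of independent processes $Y^{(j)} = \{Y_t^{(j)}, t \in \mathbb{Z}\}$, $j = 1, 2, \dots$, defined by the random coefficient AR(1) dynamics

$$Y_t^{(j)} = a^{(j)} Y_{t-1}^{(j)} + \varepsilon_t^{(j)}, \tag{2.1}$$

where $\varepsilon_t^{(j)}$, $t \in \mathbb{Z}$, $j = 1, 2, \dots$ are independent identically distributed (i.i.d.) random variables with $E\varepsilon_t^{(j)} = 0$ and $0 < \sigma_\varepsilon^2 = E(\varepsilon_t^{(j)})^2 < \infty$; $a, a^{(j)}$, $j = 1, 2, \dots$ are i.i.d. random variables with $|a| < 1$ and satisfying

$$E\Big[\frac{1}{1-a^2}\Big] < \infty. \tag{2.2}$$

It is assumed that the sequences $\{\varepsilon_t^{(j)}, t \in \mathbb{Z}\}$, $j = 1, 2, \dots$ and $\{a, a^{(j)}, j = 1, 2, \dots\}$ are independent.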
Under these conditions, (2.1) admits a stationary solution $Y_t^{(j)}$ and, according to Oppenheim and Viano (2004), the finite dimensional distributions of the process

$$X_t^{(N)} = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} Y_t^{(j)}, \quad t \in \mathbb{Z},$$

weakly converge as $N \to \infty$ to those of a zero mean stationary Gaussian process $X = \{X_t, t \in \mathbb{Z}\}$, called the aggregated process. Suppose that the random coefficient $a$ admits a density $\varphi(x)$, absolutely continuous with respect to the Lebesgue measure, which by (2.2) satisfies

$$\int_{-1}^{1} \frac{\varphi(x)}{1-x^2}\,dx < \infty. \tag{2.3}$$

Any density function satisfying (2.3) will be called a mixture density.
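The aggregation scheme above can be imitated for a finite number $N$ of individual series. The following is a minimal sketch, not the authors' code: the function name `simulate_aggregate`, the burn-in length, the Gaussian innovations and the Beta(2, 1.5) coefficient law are all assumptions made for illustration, and the normalization by $N^{-1/2}$ is the one used in the definition of $X_t^{(N)}$.

```python
import math
import random

def simulate_aggregate(n, N, draw_a, sigma_eps=1.0, burn_in=500, seed=0):
    """Return X_1..X_n with X_t = N^{-1/2} * sum_j Y_t^{(j)},
    where Y_t^{(j)} = a_j * Y_{t-1}^{(j)} + eps_t^{(j)} (hypothetical helper)."""
    rng = random.Random(seed)
    coeffs = [draw_a(rng) for _ in range(N)]   # i.i.d. AR(1) coefficients, |a| < 1
    state = [0.0] * N                          # current values Y_t^{(j)}
    scale = 1.0 / math.sqrt(N)
    out = []
    for t in range(burn_in + n):
        total = 0.0
        for j in range(N):
            state[j] = coeffs[j] * state[j] + rng.gauss(0.0, sigma_eps)
            total += state[j]
        if t >= burn_in:                       # drop the burn-in period
            out.append(scale * total)
    return out

# Illustrative coefficient law (an assumption): Beta(2, 1.5) on (0, 1),
# for which E[1/(1 - a^2)] is finite, so condition (2.2) holds.
x = simulate_aggregate(n=200, N=50, draw_a=lambda rng: rng.betavariate(2.0, 1.5))
```

Because coefficients close to 1 produce slowly decaying dependence, a finite burn-in only approximates the stationary solution; longer burn-in periods may be needed in practice.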

Note that the covariance function and the spectral density of the aggregated process $X$ coincide with those of $Y^{(j)}$ and are given, respectively, by

$$\sigma(h) := \mathrm{Cov}(X_h, X_0) = \sigma_\varepsilon^2 \int_{-1}^{1} \frac{x^{|h|}}{1-x^2}\,\varphi(x)\,dx \tag{2.4}$$

and

$$f(\lambda) = \frac{\sigma_\varepsilon^2}{2\pi} \int_{-1}^{1} \frac{\varphi(x)}{|1 - x e^{i\lambda}|^2}\,dx. \tag{2.5}$$

The disaggregation problem deals with finding the individual processes (if they exist) of form (2.1), which produce the aggregated process $X$ with given spectral density $f(\lambda)$ (or covariance $\sigma(h)$). This is equivalent to finding $\varphi(x)$ such that (2.5) (or (2.4)) and (2.3) hold. In this case, we say that the mixture density $\varphi(x)$ is associated with the spectral density $f(\lambda)$.
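A quick numerical illustration of the covariance formula $\sigma(h) = \sigma_\varepsilon^2 \int_{-1}^{1} x^{|h|}\varphi(x)/(1-x^2)\,dx$: the sketch below evaluates it by a midpoint rule and checks the identity $\sigma(0) - \sigma(2) = \sigma_\varepsilon^2$ that the estimator relies on. The density $\varphi(x) = \tfrac34(1-x^2)$ and the helper name `covariance` are assumptions chosen only because the integrals are then elementary.

```python
def covariance(h, phi, sigma_eps2=1.0, grid=20000):
    """Midpoint-rule approximation of
    sigma(h) = sigma_eps^2 * int_{-1}^{1} x^|h| phi(x) / (1 - x^2) dx."""
    step = 2.0 / grid
    total = 0.0
    for i in range(grid):
        x = -1.0 + (i + 0.5) * step
        total += x ** abs(h) * phi(x) / (1.0 - x * x)
    return sigma_eps2 * total * step

# Illustrative mixture density on [-1, 1] (an assumption): integrates to one.
phi = lambda x: 0.75 * (1.0 - x * x)
s0, s2 = covariance(0, phi), covariance(2, phi)
diff = s0 - s2   # should be close to sigma_eps^2 = 1
```

For this density $\sigma(0) = 3/2$ and $\sigma(2) = 1/2$ in closed form, so the difference equals the innovation variance, as the formula predicts.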

In order to estimate the mixture density $\varphi(x)$ using aggregated observations $X_1, \dots, X_n$, Leipus et al. (2006) proposed the estimate based on a decomposition of the function $\zeta(x) = \varphi(x)(1-x^2)^{-\alpha}$ in the orthonormal $L^2(w^{(\alpha)})$-basis of Gegenbauer polynomials $\{G_k^{(\alpha)}(x), k = 0, 1, \dots\}$, where $w^{(\alpha)}(x) = (1-x^2)^{\alpha}$, $\alpha > -1$. This decomposition is valid (i.e. $\zeta$ belongs to $L^2(w^{(\alpha)})$) if

$$\int_{-1}^{1} \frac{\varphi^2(x)}{(1-x^2)^{\alpha}}\,dx < \infty, \quad \alpha > -1. \tag{2.6}$$
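Let $G_k^{(\alpha)}(x) = \sum_{j=0}^{k} g_{k,j}^{(\alpha)} x^j$. The resulting estimate has the form

$$\hat\varphi_n(x) = \hat\sigma_{n,\varepsilon}^{-2}\,(1-x^2)^{\alpha} \sum_{k=0}^{K_n} \hat\zeta_{n,k}\, G_k^{(\alpha)}(x), \tag{2.7}$$

where the $\hat\zeta_{n,k}$ are estimates of the coefficients $\zeta_k$ in the $\alpha$-Gegenbauer expansion of the function $\zeta(x) = \sum_{k=0}^{\infty} \zeta_k G_k^{(\alpha)}(x)$ and are given by

$$\hat\zeta_{n,k} = \sum_{j=0}^{k} g_{k,j}^{(\alpha)} \big(\hat\sigma_n(j) - \hat\sigma_n(j+2)\big), \tag{2.8}$$

$\hat\sigma_{n,\varepsilon}^2 = \hat\sigma_n(0) - \hat\sigma_n(2)$ is the consistent estimator of the variance $\sigma_\varepsilon^2$ and $\hat\sigma_n(j) = n^{-1} \sum_{i=1}^{n-j} X_i X_{i+j}$ is the sample covariance of the aggregated process. The truncation level $K_n$ satisfies

$$K_n = [\gamma \log n], \quad 0 < \gamma < (2\log(1+\sqrt{2}))^{-1}. \tag{2.9}$$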



Leipus et al. (2006) assumed the following semiparametric form of the mixture density:

$$\varphi(x) = (1-x)^{1-2d_1}(1+x)^{1-2d_2}\,\psi(x), \quad 0 < d_1, d_2 < 1/2, \tag{2.10}$$

where $\psi(x)$ is continuous on $[-1, 1]$ and does not vanish at $\pm 1$. Then, under the conditions above and corresponding relations between $\alpha$ and $d_1, d_2$, they showed the consistency of the estimator $\hat\varphi_n(x)$ assuming that the variance of the noise, $\sigma_\varepsilon^2$, is known and equals 1. In the more realistic situation of unknown $\sigma_\varepsilon^2$, it must be consistently estimated. In order to understand intuitively the construction of the estimator $\hat\sigma_{n,\varepsilon}^2$, it suffices to note that, by (2.4), $\sigma_\varepsilon^2 = \sigma(0) - \sigma(2)$. Also note that the estimator $\hat\varphi_n(x)$ in (2.7) possesses the property $\int_{-1}^{1} \hat\varphi_n(x)\,dx = 1$, which can be easily verified noting that $\int_{-1}^{1} (1-x^2)^{\alpha} G_k^{(\alpha)}(x)\,dx = (g_{0,0}^{(\alpha)})^{-1}$ if $k = 0$, and $= 0$ otherwise, implying

$$\int_{-1}^{1} (1-x^2)^{\alpha} \sum_{k=0}^{K_n} \hat\zeta_{n,k}\, G_k^{(\alpha)}(x)\,dx = \hat\zeta_{n,0}/g_{0,0}^{(\alpha)} = \hat\sigma_n(0) - \hat\sigma_n(2)$$

by (2.8).
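The construction of the estimator can be sketched in a few lines. The code below is an illustrative implementation under stated assumptions, not the authors' code: it restricts to $\alpha > -1/2$, builds the orthonormal Gegenbauer polynomials from the classical polynomials $C_k^{(\alpha+1/2)}$ via their three-term recurrence and known norms, and uses the sample covariance $\hat\sigma_n(j) = n^{-1}\sum_{i=1}^{n-j} X_i X_{i+j}$. The function names (`gegenbauer_coeffs`, `sample_cov`, `density_estimate`) and the white-noise test data are hypothetical.

```python
import math
import random

def gegenbauer_coeffs(K, alpha):
    """Monomial coefficients g[k][j] of the orthonormal G_k^{(alpha)} on [-1, 1]
    with weight (1 - x^2)^alpha; this sketch assumes alpha > -1/2."""
    lam = alpha + 0.5
    polys = [[1.0], [0.0, 2.0 * lam]]          # classical C_0 and C_1 for index lam
    for k in range(2, K + 1):
        prev, prev2 = polys[k - 1], polys[k - 2]
        cur = [0.0] * (k + 1)
        for j, c in enumerate(prev):           # 2 (k + lam - 1) x C_{k-1}
            cur[j + 1] += 2.0 * (k + lam - 1.0) * c
        for j, c in enumerate(prev2):          # -(k + 2 lam - 2) C_{k-2}
            cur[j] -= (k + 2.0 * lam - 2.0) * c
        polys.append([c / k for c in cur])
    out = []
    for k in range(K + 1):
        # squared norm of C_k^{(lam)} in L^2((1 - x^2)^(lam - 1/2))
        norm2 = (math.pi * 2.0 ** (1.0 - 2.0 * lam) * math.gamma(k + 2.0 * lam)
                 / (math.factorial(k) * (k + lam) * math.gamma(lam) ** 2))
        out.append([c / math.sqrt(norm2) for c in polys[k]])
    return out

def sample_cov(x, j):
    """Sample covariance n^{-1} sum_{i=1}^{n-j} X_i X_{i+j}."""
    n = len(x)
    return sum(x[i] * x[i + j] for i in range(n - j)) / n

def density_estimate(x, K, alpha):
    """Return a function u -> estimated mixture density at u, following (2.7)-(2.8)."""
    g = gegenbauer_coeffs(K, alpha)
    cov = [sample_cov(x, j) for j in range(K + 3)]
    var_eps = cov[0] - cov[2]                  # estimate of sigma_eps^2
    zeta = [sum(g[k][j] * (cov[j] - cov[j + 2]) for j in range(k + 1))
            for k in range(K + 1)]
    def phi_hat(u):
        series = sum(zeta[k] * sum(c * u ** j for j, c in enumerate(g[k]))
                     for k in range(K + 1))
        return (1.0 - u * u) ** alpha * series / var_eps
    return phi_hat

# Hypothetical data: Gaussian white noise, used only to exercise the code.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
phi_hat = density_estimate(x, K=3, alpha=1.0)
grid = 20000
integral = sum(phi_hat(-1.0 + (i + 0.5) * 2.0 / grid) for i in range(grid)) * 2.0 / grid
```

Whatever the data, `integral` should be close to one, which reflects the normalization property discussed above; the estimate itself need not be nonnegative.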
In this paper, we further study the properties of the proposed mixture density estimator. In order to formulate the theorem about the asymptotic normality of the estimator $\hat\varphi_n(x)$, we will assume that the aggregated process $X_t$, $t \in \mathbb{Z}$ admits the following linear representation.

Assumption A Assume that $X_t$, $t \in \mathbb{Z}$ is a linear sequence

$$X_t = \sum_{j=0}^{\infty} \psi_j Z_{t-j}, \tag{2.11}$$

where the $Z_t$ are i.i.d. random variables with zero mean, finite fourth moment and the coefficients $\psi_j$ satisfy

$$\psi_j \sim c\, j^{d-1}, \quad |\psi_j - \psi_{j+1}| = O(j^{d-2}), \quad 0 < d < 1/2 \tag{2.12}$$

with some constant $c \ne 0$.

We also introduce the following condition on the mixture density $\varphi(x)$.

Assumption B Assume that the mixture density $\varphi$ has the form

$$\varphi(x) = (1-x)^{1-2d}\,\psi(x), \quad 0 < d < 1/2, \tag{2.13}$$

where $\psi(x)$ is a nonnegative function with $\mathrm{supp}(\psi) \subset [-1, 1]$, continuous at $x = 1$, $\psi(1) \ne 0$.

Note that, omitting in (2.10) the factor responsible for the seasonal part, we thus obtain the corresponding 'long memory' spectral density with singularity at zero (but not necessarily at $\pm\pi$) and the corresponding behavior of the coefficients $\psi_j$ in the linear representation (3.2).

Theorem 2.1 Let $X_t$, $t \in \mathbb{Z}$ be the aggregated process satisfying Assumption A and corresponding to the mixture density given by Assumption B. Assume that (2.6) holds, and $d$ and $\alpha$ satisfy the following condition

$$-1/2 < \alpha < \frac{5}{2} - 4d. \tag{2.14}$$

Let $K_n$ be given in (2.9) with $\gamma$ satisfying

$$0 < \gamma < (2\log(1+\sqrt{2}))^{-1}\Big(1 - \max\Big\{\alpha + 4d - \frac{3}{2},\, 0\Big\}\Big). \tag{2.15}$$

Then for every fixed $x \in (-1, 1)$, such that $\varphi(x) \ne 0$, it holds

$$\frac{\hat\varphi_n(x) - E\hat\varphi_n(x)}{\sqrt{\mathrm{Var}(\hat\varphi_n(x))}} \xrightarrow{d} N(0, 1). \tag{2.16}$$

The proof of the theorem is given in Section 4.

Remark 2.1 Suppose that $\varphi(x)$ satisfies Assumption B. Then assumption (2.6) is equivalent to $\int_{-1}^{1} \varphi^2(x)(1+x)^{-\alpha}\,dx < \infty$ and $\alpha < 3 - 4d$. The last inequality is implied by (2.14).
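The parameter restrictions of Theorem 2.1 are easy to check mechanically. The sketch below assumes the conditions read $-1/2 < \alpha < 5/2 - 4d$ for (2.14) and $0 < \gamma < (2\log(1+\sqrt2))^{-1}(1 - \max\{\alpha + 4d - 3/2, 0\})$ for (2.15), with $K_n = [\gamma \log n]$ as in (2.9); the helper names `gamma_upper_bound` and `truncation_level` are hypothetical.

```python
import math

LOG_BASE = math.log(1.0 + math.sqrt(2.0))

def gamma_upper_bound(alpha, d):
    """Upper bound for gamma in (2.15); None if (alpha, d) violate (2.14)."""
    if not (0.0 < d < 0.5 and -0.5 < alpha < 2.5 - 4.0 * d):
        return None
    return (1.0 - max(alpha + 4.0 * d - 1.5, 0.0)) / (2.0 * LOG_BASE)

def truncation_level(n, gamma):
    """K_n = [gamma * log n], the integer part, as in (2.9)."""
    return int(math.floor(gamma * math.log(n)))

bound = gamma_upper_bound(alpha=0.5, d=0.25)   # admissible pair
k_n = truncation_level(1500, 0.5)              # gamma = 0.5 is an illustrative choice
```

The logarithmic growth of $K_n$ means that only a handful of Gegenbauer polynomials are used even for samples of a few thousand observations.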

Example 2.1 Assume two mixture densities

$$\varphi(x; d) = C_1(d)\, x^{d-1}(1-x)^{1-2d}(1+x)\,\mathbf{1}_{(0,1]}(x), \quad 0 < d < 1/2, \tag{2.17}$$

where $C_1(d) = \dfrac{\Gamma(3-d)}{2\Gamma(d)\Gamma(2-2d)}$ is the normalizing constant, and

$$\varphi_g(x; \kappa) = C_2(\kappa)\,|x|^{\kappa}\,\mathbf{1}_{[-a_*, 0]}(x), \quad \kappa > 0, \tag{2.18}$$

where $0 < a_* < 1$, $C_2(\kappa) = (\kappa+1)\,a_*^{-\kappa-1}$.

According to Dacunha-Castelle and Oppenheim (2001), the spectral density corresponding to $\varphi(x; d)$ is the FARIMA(0,d,0) spectral density

$$f(\lambda; d) = \frac{1}{2\pi}\Big(2\sin\frac{|\lambda|}{2}\Big)^{-2d}. \tag{2.19}$$

Also, since the support of $\varphi_g$ lies inside $(-1, 1)$, the spectral density $g(\lambda; \kappa)$ corresponding to $\varphi_g(x; \kappa)$ is an analytic function (see Proposition 3.3 in Celov et al. (2007)).

Consider the spectral density given by

$$f(\lambda) = f(\lambda; d)\,g(\lambda; \kappa), \quad \lambda \in [-\pi, \pi]. \tag{2.20}$$

It can be shown that the mixture density $\varphi(x)$ associated with $f(\lambda)$ in (2.20) is supported on $[-a_*, 1]$ and satisfies Assumption B with $\psi(x)$ which is a continuous function on $[-a_*, 1]$ and in the neighborhood of zero satisfies $\psi(x) = O(|x|^{d})$. This implies the validity of condition (2.6) needed to obtain the corresponding $\alpha$-Gegenbauer expansion. For the proof of this example and precise asymptotics of $\psi(x)$ at zero see Appendix A.
Finally, the aggregated process $X$, obtained using such a mixture density $\varphi(x)$, satisfies Assumption A by Proposition 3.2, which shows that assumptions A and B are satisfied under a general 'aggregated' spectral density $f(\lambda) = f(\lambda; d)\,g(\lambda)$, where $g(\lambda)$ is an analytic function on $[-\pi, \pi]$ and the associated mixture density is supported on $[-a_*, 0]$ with some $0 < a_* < 1$.

Remark 2.2 Note that the 'FARIMA mixture density' (2.17), due to the factor $x^{d-1}$, does not satisfy (2.6), and a "compensating" density such as $\varphi_g(x; \kappa)$ in (2.18) is needed in order to obtain the required integrability in the neighborhood of zero. Obviously, for the same aim, other mixture densities instead of $\varphi_g(x; \kappa)$ in (2.18) can be employed.

3 Moving average representation of the aggregated process

In order to obtain the asymptotic normality result in Theorem 2.1, an important assumption is that the aggregated process admits a linear representation with coefficients decaying at an appropriate rate (see Bhansali et al. (2007)). The related issues about the moving average representation of the aggregated process are discussed in this section.

It follows from the aggregation scheme that any aggregated process admits an absolutely continuous spectral measure. If, in addition, its spectral density, say, $f(\lambda)$ satisfies

$$\int_{-\pi}^{\pi} \log f(\lambda)\,d\lambda > -\infty, \tag{3.1}$$

then the function

$$h(z) = \exp\Big\{\frac{1}{4\pi}\int_{-\pi}^{\pi} \frac{e^{i\lambda} + z}{e^{i\lambda} - z}\,\log f(\lambda)\,d\lambda\Big\}, \quad |z| < 1,$$

is an outer function from the Hardy space $H^2$, does not vanish for $|z| < 1$ and $f(\lambda) = |h(e^{i\lambda})|^2$. Then, by the Wold decomposition theorem, the corresponding process $X_t$ is purely nondeterministic and has the MA($\infty$) representation (see Anderson (1971, Ch. 7.6.3))

$$X_t = \sum_{j=0}^{\infty} \psi_j Z_{t-j}, \tag{3.2}$$

where the coefficients $\psi_j$ are defined from the expansion of the normalized outer function $h(z)/h(0)$, $\sum_{j=0}^{\infty} \psi_j^2 < \infty$, $\psi_0 = 1$, and $Z_t = X_t - \hat X_t$, $t = 0, 1, \dots$ ($\hat X_t$ is the optimal linear predictor of $X_t$) is the innovation process, which is zero mean, uncorrelated, with variance

$$\sigma^2 = 2\pi \exp\Big\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \log f(\lambda)\,d\lambda\Big\}. \tag{3.3}$$

By construction, the aggregated process is Gaussian, implying that the innovations $Z_t$ are i.i.d. $N(0, \sigma^2)$ random variables.
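The innovation variance formula $\sigma^2 = 2\pi\exp\{\frac{1}{2\pi}\int_{-\pi}^{\pi}\log f(\lambda)\,d\lambda\}$ can be sanity-checked on a case where the answer is known. The sketch below uses the spectral density of an ordinary AR(1) process with fixed coefficient $a$ and innovation variance $\sigma^2$, for which the formula must return $\sigma^2$ exactly; the function names are hypothetical and the midpoint rule is just one convenient quadrature.

```python
import math

def innovation_variance(f, grid=4096):
    """2*pi*exp{(1/(2*pi)) * integral over [-pi, pi] of log f}, by the midpoint rule."""
    step = 2.0 * math.pi / grid
    integral = sum(math.log(f(-math.pi + (i + 0.5) * step)) for i in range(grid)) * step
    return 2.0 * math.pi * math.exp(integral / (2.0 * math.pi))

def ar1_spectral_density(a, sigma2):
    """f(lam) = sigma2 / (2*pi*|1 - a e^{i lam}|^2) for a fixed coefficient |a| < 1."""
    return lambda lam: sigma2 / (2.0 * math.pi * (1.0 - 2.0 * a * math.cos(lam) + a * a))

v = innovation_variance(ar1_spectral_density(0.6, 2.0))   # expected to be close to 2.0
```

For long memory spectral densities with a pole at zero the integrand has a logarithmic singularity, so a plain midpoint rule converges more slowly there.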

Next we focus on the class of semiparametric mixture densities satisfying Assumption B. As mentioned earlier, this form is natural; in particular, it covers the mixture densities $\varphi(x; d)$ and $\varphi(x)$ in Example 2.1.

Proposition 3.1 Let the mixture density $\varphi(x)$ satisfy Assumption B. Assume that either

(i) $\mathrm{supp}(\psi) = [-1, 1]$ and $\tilde\psi(x) = \psi(x)(1+x)^{2\tilde d - 1}$ is continuous at $-1$ and $\tilde\psi(-1) \ne 0$ with some $0 < \tilde d < 1/2$,

or

(ii) $\mathrm{supp}(\psi) \subset [-a_*, 1]$ with some $0 < a_* < 1$.

Then the aggregated process admits a moving average representation (3.2), where the $Z_t$ are Gaussian i.i.d. random variables with zero mean and variance (3.3).

Proof. (i) We have to verify that (3.1) holds. Rewrite $\varphi(x)$ in the form

$$\varphi(x) = (1-x)^{1-2d}(1+x)^{1-2\tilde d}\,\tilde\psi(x).$$

Proposition 4.1 in Celov et al. (2007) implies

$$f(\lambda) \sim C_1 |\lambda|^{-2d}, \quad |\lambda| \to 0,$$
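with $C_1 > 0$. Hence $\log f(\lambda) \sim \log C_1 - C_2 \log|1 - e^{i\lambda}|$, $|\lambda| \to 0$, where $C_2 > 0$. For any $\epsilon > 0$ choose $0 < \lambda_0 < \pi/3$, such that

$$\frac{\log f(\lambda) - \log C_1}{-C_2 \log|1 - e^{i\lambda}|} \ge 1 - \epsilon, \quad 0 < \lambda \le \lambda_0.$$

Since $-\log|1 - e^{i\lambda}| \ge 0$ for $0 \le \lambda \le \pi/3$, we obtain

$$\int_0^{\lambda_0} \log f(\lambda)\,d\lambda \ge \lambda_0 \log C_1 - C_2(1-\epsilon)\int_0^{\lambda_0} \log|1 - e^{i\lambda}|\,d\lambda > -\infty \tag{3.4}$$

using the well known fact that $\int_0^{\pi} \log|1 - e^{i\lambda}|\,d\lambda = 0$. Similarly,

$$\int_{\pi-\lambda_0}^{\pi} \log f(\lambda)\,d\lambda > -\infty. \tag{3.5}$$

When $\lambda \in [\lambda_0, \pi - \lambda_0]$, there exist $0 < L_1 < L_2 < \infty$ such that

$$L_1 \le \frac{\sigma_\varepsilon^2}{2\pi|1 - x e^{i\lambda}|^2} \le L_2$$

uniformly in $x \in (-1, 1)$. Thus, by (2.5), $L_1 \le f(\lambda) \le L_2$ for any $\lambda \in [\lambda_0, \pi - \lambda_0]$, and therefore

$$\int_{\lambda_0}^{\pi-\lambda_0} \log f(\lambda)\,d\lambda > -\infty. \tag{3.6}$$

(3.4)-(3.6) imply inequality (3.1).

The proof in case (ii) is analogous to (i) and, thus, is omitted. □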

Lemma 3.1 If the spectral density $g(\lambda)$ of the aggregated process $X_t$, $t \in \mathbb{Z}$ is an analytic function on $[-\pi, \pi]$, then $X_t$ admits the representation

$$X_t = \sum_{j=0}^{\infty} g_j Z_{t-j},$$

where the $Z_t$ are i.i.d. Gaussian random variables with zero mean and variance

$$\sigma_g^2 = 2\pi \exp\Big\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \log g(\lambda)\,d\lambda\Big\} \tag{3.7}$$

and the $g_j$ satisfy $|\sum_{j=0}^{\infty} g_j| < \infty$, $g_0 = 1$.
Proof. From Proposition 3.3 in Celov et al. (2007) it follows that there exists $0 < a_* < 1$ such that

$$g(\lambda) = \frac{\sigma_\varepsilon^2}{2\pi}\int_{-a_*}^{a_*} \frac{\varphi_g(x)}{|1 - x e^{i\lambda}|^2}\,dx. \tag{3.8}$$

For all $x \in [-a_*, a_*]$ and $\lambda \in [0, \pi]$ we have

$$\frac{1}{|1 - x e^{i\lambda}|^2} \ge C_3 > 0,$$

where $C_3 = C_3(a_*)$. This and (3.8) imply $\int_{-\pi}^{\pi} \log g(\lambda)\,d\lambda > -\infty$. Finally, $|\sum_{j=0}^{\infty} g_j| < \infty$ follows from the representation

$$g(\lambda) = \frac{\sigma_g^2}{2\pi}\Big|\sum_{j=0}^{\infty} g_j e^{ij\lambda}\Big|^2$$

and the assumption of analyticity of $g$. □

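Proposition 3.2 Let $X_t$, $t \in \mathbb{Z}$ be an aggregated process with spectral density

$$f(\lambda) = f(\lambda; d)\,g(\lambda), \tag{3.9}$$

where $f(\lambda; d)$ is the FARIMA spectral density (2.19) and $g(\lambda)$ is an analytic spectral density. Then:

(i) if the mixture density $\varphi_g(x)$ associated with $g(\lambda)$ satisfies $\mathrm{supp}(\varphi_g) \subset [-a_*, 0]$ with some $0 < a_* < 1$, then $\varphi(x)$, associated with $f(\lambda)$, satisfies Assumption B.

(ii) $X_t$ admits a linear representation (3.2), where the $Z_t$ are Gaussian i.i.d. random variables with zero mean and variance

$$\sigma^2 = 2\pi\exp\Big\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \log f(\lambda)\,d\lambda\Big\} = \exp\Big\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \log g(\lambda)\,d\lambda\Big\} = \frac{\sigma_g^2}{2\pi}$$

and the coefficients $\psi_j$ satisfy

$$\psi_j \sim \frac{\sum_{k=0}^{\infty} g_k}{\Gamma(d)}\, j^{d-1}, \quad |\psi_j - \psi_{j+1}| = O(j^{d-2}), \tag{3.10}$$

where $\psi_0 = 1$. (Here, the $g_k$ are given in Lemma 3.1.)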
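The first relation in (3.10) can be illustrated numerically. The sketch below assumes that the coefficients are obtained as the convolution $\psi_j = \sum_{i=0}^{j} h_{j-i} g_i$ of the FARIMA(0,d,0) moving average coefficients $h_j = \Gamma(j+d)/(\Gamma(j+1)\Gamma(d))$ with an exponentially decaying sequence $g_j$; the particular choice $g_j = \rho^j$, the truncation of $g$ at 200 terms and the function names are hypothetical.

```python
import math

def farima_ma_coeffs(d, m):
    """h_0..h_m with h_j = Gamma(j+d) / (Gamma(j+1) Gamma(d)),
    computed through the recursion h_j = h_{j-1} * (j - 1 + d) / j."""
    h = [1.0]
    for j in range(1, m + 1):
        h.append(h[-1] * (j - 1.0 + d) / j)
    return h

def psi_coeff(j, h, g):
    """psi_j = sum_{i=0}^{j} h_{j-i} g_i for a finite list g (g_i = 0 beyond its length)."""
    return sum(h[j - i] * g[i] for i in range(min(j, len(g) - 1) + 1))

d, rho, m = 0.3, 0.5, 2000
h = farima_ma_coeffs(d, m)
g = [rho ** i for i in range(200)]          # hypothetical exponentially decaying g_j, g_0 = 1
limit_const = sum(g) / math.gamma(d)        # the constant (sum of g_k) / Gamma(d)
ratio = psi_coeff(m, h, g) / (limit_const * m ** (d - 1.0))
```

The ratio approaches one as the lag grows, with a relative error of order $1/j$.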
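Proof. (i) By Corollary 3.1 in Celov et al. (2007), the mixture density associated with the "product" spectral density (3.9) exists and has the form

$$\varphi(x) = C_*^{-1}\Big(\varphi(x; d)\int_{-a_*}^{0} \frac{\varphi_g(y)\,dy}{(1-xy)(1-y/x)} + \varphi_g(x)\int_{0}^{1} \frac{\varphi(y; d)\,dy}{(1-xy)(1-y/x)}\Big) \tag{3.11}$$

with

$$C_* = \int_0^1\Big(\int_{-a_*}^{0} \frac{\varphi(x; d)\,\varphi_g(y)}{1-xy}\,dy\Big)dx, \tag{3.12}$$

where $\varphi(x; d)$ is given in (2.17) and is associated with the spectral density $f(\lambda; d)$, and $\varphi_g(x)$ is associated with the spectral density $g(\lambda)$. Clearly, this implies that Assumption B is satisfied.

(ii) We have

$$f(\lambda; d) = \frac{1}{2\pi}\Big|\sum_{j=0}^{\infty} h_j e^{ij\lambda}\Big|^2 \quad \text{with} \quad h_j = \frac{\Gamma(j+d)}{\Gamma(j+1)\Gamma(d)}$$

and, recall,

$$g(\lambda) = \frac{\sigma_g^2}{2\pi}\Big|\sum_{j=0}^{\infty} g_j e^{ij\lambda}\Big|^2, \quad \sum_{j=0}^{\infty} g_j^2 < \infty,$$

since, by Lemma 3.1, $\int_{-\pi}^{\pi} \log g(\lambda)\,d\lambda > -\infty$. On the other hand, $\int_{-\pi}^{\pi} \log f(\lambda)\,d\lambda > -\infty$ implies

$$f(\lambda) = \frac{\sigma^2}{2\pi}\Big|\sum_{j=0}^{\infty} \psi_j e^{ij\lambda}\Big|^2, \quad \sum_{j=0}^{\infty} \psi_j^2 < \infty,$$

and, by uniqueness of the representation,

$$\psi_k = \sum_{j=0}^{k} h_{k-j}\, g_j.$$

It is easy to see that

$$\sum_{j=0}^{k} h_{k-j}\, g_j \sim C_4 k^{d-1}, \tag{3.13}$$

where $C_4 = \Gamma^{-1}(d)\sum_{j=0}^{\infty} g_j$. Indeed, taking into account that $h_k \sim \Gamma^{-1}(d) k^{d-1}$, we can write

$$\sum_{j=0}^{k} h_{k-j}\, g_j = \Gamma^{-1}(d)\,k^{d-1}\sum_{j=0}^{\infty} a_{k,j}\, g_j,$$

where $a_{k,j} = h_{k-j}\,\Gamma(d)\,k^{1-d}\,\mathbf{1}_{\{j \le k\}} \to 1$ as $k \to \infty$ for each $j$. On the other hand, we have $|a_{k,j}| \le C(1+j)^{1-d}$ uniformly in $k$ and, since the $g_j$ decay exponentially fast, the sum $\sum_{j=0}^{\infty}(1+j)^{1-d}|g_j|$ converges and the dominated convergence theorem applies to obtain (3.13).
Hence, we can write

$$f(\lambda) = \frac{\sigma^2}{2\pi}\Big|\sum_{j=0}^{\infty} \psi_j e^{ij\lambda}\Big|^2, \quad \psi_0 = 1,$$

where $\psi_j = \sum_{k=0}^{j} h_{j-k}\, g_k \sim C_4 j^{d-1}$. Thus, representation (3.2) and the first relation in (3.10) follow.

Finally, in order to check the second relation in (3.10), it suffices to note that

$$\psi_j - \psi_{j+1} = \sum_{i=0}^{j} (h_{j-i} - h_{j+1-i})\, g_i - g_{j+1},$$

where $h_j - h_{j+1} \sim C_5 j^{d-2}$ and the $g_j$ decay exponentially fast. □

4 Proof of main result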

In order to prove Theorem 2.1 we use the result of Bhansali et al. (2007), who considered the following quadratic form

$$Q_{n,X} = \sum_{t,s=1}^{n} d_n(t-s)\,X_t X_s,$$

where the $X_t$ are linear sequences satisfying Assumption A and the function $d_n(k)$ satisfies the following assumption.

Assumption C Suppose that

$$d_n(k) = \int_{-\pi}^{\pi} \eta_n(\lambda)\,e^{ik\lambda}\,d\lambda$$

with some even real function $\eta_n(\lambda)$, such that, for some $-1 < \beta < 1$ and a sequence of constants $m_n > 0$, it holds

$$|\eta_n(\lambda)| \le m_n|\lambda|^{-\beta}, \quad \lambda \in [-\pi, \pi]. \tag{4.1}$$

Denote by $E_n$ a matrix $(e_n(t-s))_{t,s=1,\dots,n}$, where

$$e_n(t-s) = \int_{-\pi}^{\pi} \eta_n(\lambda)\,f(\lambda)\,e^{i\lambda(t-s)}\,d\lambda \tag{4.2}$$

and let $\|E_n\|^2 = \sum_{t,s=1}^{n} e_n^2(t-s)$.



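Theorem 4.1 (Bhansali et al. (2007)) Suppose that assumptions A and C are satisfied. If $2d + \beta < 1/2$ and

$$r_n = o(\|E_n\|), \tag{4.3}$$

where

$$r_n = \begin{cases} m_n\, n^{\max(0,\,2d+\beta)} & \text{if } 2d + \beta \ne 0, \\ m_n \log n & \text{if } 2d + \beta = 0, \end{cases} \tag{4.4}$$

then, as $n \to \infty$, it holds

$$\mathrm{Var}(Q_{n,X}) \asymp \|E_n\|^2$$

and

$$\frac{Q_{n,X} - EQ_{n,X}}{\sqrt{\mathrm{Var}(Q_{n,X})}} \xrightarrow{d} N(0, 1).$$

(Here for $a_n, b_n > 0$, $a_n \asymp b_n$ means that $C_6 b_n \le a_n \le C_7 b_n$ for some $C_6, C_7 > 0$.)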

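Proof of Theorem 2.1 First of all, note that

$$\hat\sigma_{n,\varepsilon}^2 \xrightarrow{P} \sigma_\varepsilon^2,$$

which easily follows using Theorem 3 in Hosking (1996). Hence, to obtain convergence (2.16), we can replace the factor $\hat\sigma_{n,\varepsilon}^2$ by $\sigma_\varepsilon^2$ in the definition of $\hat\varphi_n(x)$. Without loss of generality assume that $\sigma_\varepsilon^2 = 1$.
Rewrite the estimate $\hat\varphi_n(x)$ in the form

$$\hat\varphi_n(x) = (1-x^2)^{\alpha}\sum_{k=0}^{K_n}\sum_{j=0}^{k} g_{k,j}^{(\alpha)}\big(\hat\sigma_n(j) - \hat\sigma_n(j+2)\big)\,G_k^{(\alpha)}(x)$$
$$= (1-x^2)^{\alpha}\sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\sum_{j=0}^{k} g_{k,j}^{(\alpha)}\int_{-\pi}^{\pi}\big(e^{i\lambda j} - e^{i\lambda(j+2)}\big)\,I_n(\lambda)\,d\lambda$$
$$= \int_{-\pi}^{\pi} \eta_n(\lambda; x)\,I_n(\lambda)\,d\lambda, \tag{4.5}$$

where

$$\eta_n(\lambda; x) := (1-x^2)^{\alpha}\sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\sum_{j=0}^{k} g_{k,j}^{(\alpha)}\big(e^{i\lambda j} - e^{i\lambda(j+2)}\big) \tag{4.6}$$

and $I_n(\lambda) = (2\pi n)^{-1}\big|\sum_{j=1}^{n} X_j e^{ij\lambda}\big|^2$, $\lambda \in [-\pi, \pi]$ is the periodogram.

Now the proof follows from Assumption A and the results obtained in Lemma 4.1 and Lemma 4.2 below, which imply that, under an appropriate choice of $m_n$ and $\beta$, all the assumptions in Theorem 4.1 are satisfied. In particular, by Lemma 4.1 the following bound for the kernel $\eta_n(\lambda; x)$ holds

$$|\eta_n(\lambda; x)| \le m_n|\lambda|^{-\beta}, \tag{4.7}$$

where

$$m_n = C_8\, n^{\gamma\log(1+\sqrt{2})}, \quad \beta = \frac{\alpha}{2} - \frac{3}{4}, \tag{4.8}$$

$C_8$ is a positive constant, depending on $x$ and $\alpha$. Clearly, (2.14) implies that $-1 < \beta < \frac12 - 2d < \frac12$ and $2d + \beta < \frac12$.

Consider the cases $2d + \beta \le 0$ or $0 < 2d + \beta < 1/2$. In the case $2d + \beta \le 0$, from (4.4), (4.8) we obtain

$$r_n = C_8\begin{cases} n^{\gamma\log(1+\sqrt{2})} & \text{if } 2d + \frac{\alpha}{2} - \frac{3}{4} < 0, \\ n^{\gamma\log(1+\sqrt{2})}\log n & \text{if } 2d + \frac{\alpha}{2} - \frac{3}{4} = 0. \end{cases}$$

Hence, by Lemma 4.2, $r_n\|E_n\|^{-1} \to 0$ because $\gamma\log(1+\sqrt{2}) < 1/2$.
Assume now $2d + \beta > 0$. Then

$$r_n = C_8\, n^{\gamma\log(1+\sqrt{2}) + 2d + \frac{\alpha}{2} - \frac{3}{4}}$$

and $r_n\|E_n\|^{-1} \to 0$ by (2.15). □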

The following lemma shows that the kernel $\eta_n(\lambda; x)$ given in (4.6) satisfies inequality (4.7) with $m_n$ and $\beta$ given in (4.8).

Lemma 4.1 For the quantity $\eta_n(\lambda; x)$ given in (4.6) and for every fixed $x \in (-1, 1)$, $0 < |\lambda| < \pi$ it holds

$$|\eta_n(\lambda; x)| \le C_9\, n^{\gamma\log(1+\sqrt{2})}\,|\lambda|^{-(2\alpha-3)/4}\begin{cases} (1-x^2)^{\alpha/2-1/4} & \text{if } \alpha > -1/2, \\ (1-x^2)^{\alpha} & \text{if } -1 < \alpha < -1/2, \end{cases}$$

where $C_9$ depends on $\alpha$, and $\gamma$ is given in (2.9).

Lemma 4.2 Assume that a mixture density $\varphi(x)$ satisfies condition (2.6) and let $K_n \to \infty$. Then for every $x \in (-1, 1)$, such that $\varphi(x) \ne 0$, it holds

$$\|E_n\|^2 \ge C_{10}\, n\,(1 + o(1)), \tag{4.9}$$

where $C_{10} > 0$ is a positive constant depending on $\alpha$ and $x$.

Proofs of these two lemmas are given in Appendix B.



5 A simulation study 

In order to gain further insight into the asymptotic normality property of the mixture density estimator (2.7), in this section we conduct a Monte-Carlo simulation study. Several examples are considered, which correspond to mixture densities having different shapes (here we do not pose the question of which rigorous aggregating schemes lead to the latter).
The following two families of mixture densities

$$\varphi(x) = w\varphi_1(x) + (1-w)\varphi_2(x), \quad 0 < w < 1,$$

are considered:

• Beta-type mixture densities defined by

$$\varphi_1(x) \propto x^{p_1-1}(1-x)^{q_1-1}\,\mathbf{1}_{[0,1]}(x), \quad p_1 > 0,\ q_1 > 0,$$
$$\varphi_2(x) \propto |x|^{p_2-1}(a_* + x)^{q_2-1}\,\mathbf{1}_{[-a_*,0]}(x), \quad p_2 > 0,\ q_2 > 0,\ 0 < a_* < 1;$$

• mixed (Beta and Uniform)-type mixture densities defined by

$$\varphi_1(x) \propto x^{p_3-1}(1-x)^{q_3-1}\,\mathbf{1}_{[0,1]}(x), \quad p_3 > 0,\ q_3 > 0,$$
$$\varphi_2(x) = a_*^{-1}\,\mathbf{1}_{[-a_*,0]}(x), \quad 0 < a_* < 1.$$

In order to construct the mixture density estimator, in the first step, the parameters $K_n$ and $\alpha$ must be chosen. Preliminary Monte-Carlo simulations showed that the estimator $\hat\varphi_n(x)$ has the minimal mean integrated square error (MISE) when the parameter $\alpha$ is chosen to be equal to $1 - 2d$. The justification of this interesting conjecture remains an open problem. This rule also ensures that (2.14) is satisfied. The number of Gegenbauer polynomials $K_n$ is chosen according to (2.9). Note that, by construction, the estimator $\hat\varphi_n(x)$ is not necessarily positive, though it integrates to one.

In Figure 1, we present three graphs and corresponding box plots for the mixture densities of the form above. Cases 1 and 2 correspond to the Beta-type mixture densities, Case 3 corresponds to the mixed (Beta and Uniform)-type mixture density. The parameter values are presented in Table 1. The box plots are obtained by a Monte-Carlo procedure based on $M = 500$ independent replications with sample size $n = 1500$ and bandwidth $K_n = 3$ (we aggregate $N = 5000$ i.i.d. AR(1) processes). Individual innovations $\varepsilon_t^{(j)}$ are i.i.d. $N(0, 1)$. Note that the mixture density in Case 2 corresponds to Example 2.1 with the parameters $d = 0.2$, $\kappa = 0.1$ (in the sense of behavior at zero).
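Drawing AR(1) coefficients from the Beta-type mixture is straightforward, because the negative component is a rescaled and reflected Beta law. The sketch below is one possible sampler, not the authors' simulation code: the function name `draw_coefficient` and the seed are hypothetical, and the parameter values are those listed for Case 1 in Table 1.

```python
import random

def draw_coefficient(rng, w, a_star, p1, q1, p2, q2):
    """Draw a from w*phi_1 + (1-w)*phi_2, with phi_1 the Beta(p1, q1) density on [0, 1]
    and phi_2(x) proportional to |x|^(p2-1) * (a_star + x)^(q2-1) on [-a_star, 0]."""
    if rng.random() < w:
        return rng.betavariate(p1, q1)
    # If B ~ Beta(p2, q2), then -a_star * B has a density proportional to phi_2.
    return -a_star * rng.betavariate(p2, q2)

rng = random.Random(2024)
# Case 1 of Table 1: w = 0.8, a_* = 0.95, (p1, q1) = (3.0, 1.5), (p2, q2) = (2.0, 1.0).
draws = [draw_coefficient(rng, 0.8, 0.95, 3.0, 1.5, 2.0, 1.0) for _ in range(20000)]
share_positive = sum(a > 0 for a in draws) / len(draws)
```

Each draw would then serve as the coefficient of one individual AR(1) series before aggregation; the share of positive coefficients should be close to the weight $w$.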





          w     a_*    (p1, q1)     (p2, q2)     (p3, q3)     d      α
Case 1    0.8   0.95   (3.0, 1.5)   (2.0, 1.0)   -            0.25   0.5
Case 2    0.8   0.80   (1.2, 1.6)   (1.3, 2.5)   -            0.20   0.6
Case 3    0.8   0.90   -            -            (2.0, 1.2)   0.40   0.2

Table 1: Parameter values in Cases 1-3.







Box plots in Figure 1 show that $\hat\varphi_n$ approximates the mixture density well when $n$ is sufficiently large. However, when the sample size is relatively small it is difficult to estimate a mixture density of the shape as in Cases 2-3. This can be explained by the construction of the estimator, which assumes a rather smooth form of the mixture density around zero. On the other hand, it is clear that the AR(1) parameter values which are close to zero do not affect the long memory property. For our purposes, an important fact is that the estimator correctly approximates the density in the neighborhood of $x = 1$. This enables us to estimate the unknown (in real applications) parameter $d$ using a log-log regression on the periodogram in the neighborhood of this point (for example, Geweke and Porter-Hudak or Whittle-type estimators).

Figure 1: True mixture densities (solid line) and the box plots of the estimates. Panels: (a) Case 1, (b) Case 2, (c) Case 3. Number of replications M = 500, sample size n = 1500.

Figure 2: QQ plots and histograms of the estimates at points x = −0.5 and x = 0.96 (Case 1). Number of replications M = 500, sample size n = 1500.

Figure 2 supplements the earlier findings and shows that the distribution of the estimator is approximately normal.$^1$ QQ-plots and histograms are given for the fixed values $x = -0.5$ and $x = 0.96$, respectively. We use the same number of replications $M = 500$ and sample size $n = 1500$.

The last Monte-Carlo experiment aims to show that the decay rate of $\mathrm{Var}(\hat\varphi_n(x))$ is $n^{-\gamma}$ with $\gamma = 1$. This ensures that the variance is decreasing fast enough. To do this, we calculate the log-log regression of the variance on the length of the time series $n \in \{500, 600, \dots, 1400, 1500, 2000, \dots, 5000\}$. Figure 3 demonstrates the corresponding parameter estimates at different points and shows that $\gamma \approx 1$.

$^1$ The Shapiro-Wilk test confirms that in most cases the normality hypothesis is consistent with the data.

Figure 3: Log-log scale regression of the variance of $\hat\varphi_n(x)$ as a function of $n$. Panels: (a) Case 1, x = −0.5; (b) Case 1, x = 0.96. The variance is estimated using M = 500 independent replications.
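The decay-rate check amounts to an ordinary least-squares fit of the log variance on the log sample size. The sketch below shows the computation on synthetic variances that decay exactly like $1/n$, so the fitted slope should be $-1$; the function name `loglog_slope` is hypothetical, the synthetic values are not results from the paper, and the step of 500 between 2000 and 5000 is an assumption, since the source only indicates the range.

```python
import math

def loglog_slope(ns, variances):
    """Least-squares slope of log(variance) regressed on log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(v) for v in variances]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Sample sizes 500, 600, ..., 1500 and 2000, ..., 5000 (step 500 above 2000 is assumed).
ns = list(range(500, 1501, 100)) + list(range(2000, 5001, 500))
synthetic = [3.0 / n for n in ns]      # synthetic variances, decaying exactly like 1/n
slope = loglog_slope(ns, synthetic)    # the decay exponent is -slope
```

In a real experiment the variances would be replaced by Monte-Carlo estimates of the variance of the estimator at a fixed point for each sample size.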

6 Appendix A. Proof of Example 2.1

By Corollary 3.1 in Celov et al. (2007), the mixture density $\varphi(x)$, $x \in [-a_*, 1]$ associated with $f(\lambda)$ in (2.20) is given by equality (3.11), where $\varphi_g(x) = \varphi_g(x; \kappa)$. Clearly, in this case, (3.11) can be rewritten in the form (2.13) with

$$\psi(x) = C(\psi_1(x) + \psi_2(x)), \tag{6.1}$$

where $C = C_1(d)\,C_2(\kappa)\,C_*^{-1}$ is a positive constant,

$$\psi_1(x) = x^{d-1}(1+x)\,\mathbf{1}_{(0,1]}(x)\int_{-a_*}^{0} \frac{|y|^{\kappa}}{(1-xy)(1-y/x)}\,dy, \tag{6.2}$$

$$\psi_2(x) = |x|^{\kappa}(1-x)^{2d-1}\,\mathbf{1}_{[-a_*,0]}(x)\int_{0}^{1} \frac{y^{d-1}(1-y)^{1-2d}(1+y)}{(1-xy)(1-y/x)}\,dy. \tag{6.3}$$

Denote by $F(a, b; c; x)$ the hypergeometric function

$$F(a, b; c; x) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c-b)}\int_0^1 t^{b-1}(1-t)^{c-b-1}(1-tx)^{-a}\,dt,$$

with $c > b > 0$ if $x < 1$ and, in addition, $c - a - b > 0$ if $x = 1$. Then the corresponding integrals in $\psi_1(x)$ and $\psi_2(x)$ can be rewritten as

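$$\int_{-a_*}^{0} \frac{|y|^{\kappa}}{(1-xy)(1-y/x)}\,dy = \frac{a_*^{\kappa+1}}{(\kappa+1)(1-x^2)}\Big(F(1, \kappa+1; \kappa+2; -a_*/x) - x^2 F(1, \kappa+1; \kappa+2; -a_* x)\Big) \sim \frac{a_*^{\kappa}}{\kappa}\,x, \quad \text{as } x \to 0+,$$

and

$$\int_{0}^{1} \frac{y^{d-1}(1-y)^{1-2d}(1+y)}{(1-xy)(1-y/x)}\,dy = \frac{\Gamma(d)\Gamma(2-2d)}{\Gamma(2-d)}\cdot\frac{F(1, d; 2-d; 1/x) - xF(1, d; 2-d; x)}{1-x} \sim \Gamma(d)\Gamma(1-d)\,|x|^{d}, \quad \text{as } x \to 0-,$$

where the last asymptotics follow from the well known properties of the hypergeometric functions (see Abramowitz and Stegun (1965)).
Thus, from (6.2)-(6.3) we obtain that

$$\psi_1(x) \sim \frac{a_*^{\kappa}}{\kappa}\,x^{d}, \quad \text{as } x \to 0+, \tag{6.4}$$

$$\psi_2(x) \sim \Gamma(d)\Gamma(1-d)\,|x|^{\kappa+d}, \quad \text{as } x \to 0-. \tag{6.5}$$

(6.1) and relations (6.4)-(6.5) complete the proof. □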

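7 Appendix B. Proofs of Lemmas 4.1-4.2

Proof of Lemma 4.1 By (4.6),

$$(1-x^2)^{-\alpha}\eta_n(\lambda; x) = \sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\sum_{j=0}^{k} g_{k,j}^{(\alpha)}\big(e^{i\lambda j} - e^{i\lambda(j+2)}\big)$$
$$= (1 - e^{2i\lambda})\sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\sum_{j=0}^{k} g_{k,j}^{(\alpha)}\,e^{i\lambda j}$$
$$= (1 - e^{2i\lambda})\sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\,G_k^{(\alpha)}(e^{i\lambda}).$$

This and Lemma 7.1 below imply

$$(1-x^2)^{-\alpha}|\eta_n(\lambda; x)| \le C_{11}|\lambda|^{-(2\alpha-3)/4}\sum_{k=0}^{K_n} |G_k^{(\alpha)}(x)|\,(1+\sqrt{2})^{k}. \tag{7.1}$$

Now, using the fact that for all $-1 < x < 1$

$$|G_k^{(\alpha)}(x)| \le \begin{cases} C_{12}(1-x^2)^{-\alpha/2-1/4} & \text{if } \alpha > -1/2, \\ C_{12} & \text{if } \alpha < -1/2,\ \alpha \ne -3/2, -5/2, \dots \end{cases}$$

(see inequality (7.33.6) in Szegő (1967) and (3.9) in Leipus et al. (2006)) and (2.9), we get from (7.1), for fixed $x$ (the bound on $|G_k^{(\alpha)}(x)|$ being absorbed into the constants),

$$(1-x^2)^{-\alpha}|\eta_n(\lambda; x)| \le C_{13}|\lambda|^{-(2\alpha-3)/4}(1+\sqrt{2})^{K_n}$$
$$= C_{13}|\lambda|^{-(2\alpha-3)/4}\,e^{K_n\log(1+\sqrt{2})}$$
$$\le C_{9}|\lambda|^{-(2\alpha-3)/4}\,n^{\gamma\log(1+\sqrt{2})}.$$

□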



Lemma 7.1 For all $k \ge 0$, $\alpha > -1$ ($\alpha \ne -1/2$) and $0 < |\lambda| < \pi$ it holds

$$|(1 - e^{2i\lambda})\,G_k^{(\alpha)}(e^{i\lambda})| \le C_{11}(1+\sqrt{2})^{k}|\lambda|^{-(2\alpha-3)/4},$$

where the constant $C_{11}$ depends on $\alpha$.

Proof. Theorem 8.21.10 of Szegő (1967) implies that for the usual (non-normalized) Gegenbauer polynomials with $\alpha > -1$, $\alpha \ne -1/2$ it holds

$$C_k^{(\alpha+1/2)}(e^{i\lambda}) = \frac{\Gamma(k+\alpha+\frac12)}{\Gamma(k+1)\Gamma(\alpha+\frac12)}\,z^{k}(1 - z^{-2})^{-\alpha-1/2} + O(k^{\alpha-3/2}|z|^{k}), \tag{7.2}$$

where the complex numbers $w = e^{i\lambda}$ and $z$ are connected by the elementary conformal mapping

$$w = \frac12(z + z^{-1}), \quad z = w + (w^2 - 1)^{1/2}, \tag{7.3}$$

and $z$ satisfies $|z| > 1$ (thus, $\lambda \ne 0, \pm\pi$).
Recall that the normalized Gegenbauer polynomials $G_k^{(\alpha)}(w)$ are linked to $C_k^{(\alpha+1/2)}(w)$ by the equality

$$G_k^{(\alpha)}(w) = \gamma_k^{-1/2}\,C_k^{(\alpha+1/2)}(w), \quad \gamma_k = \frac{\pi}{2^{2\alpha}}\cdot\frac{\Gamma(k+2\alpha+1)}{(k+\alpha+\frac12)\,\Gamma^2(\alpha+\frac12)\,\Gamma(k+1)}.$$

Therefore, in terms of the normalized Gegenbauer polynomials, (7.2) reads as follows

$$G_k^{(\alpha)}(e^{i\lambda}) = \mathrm{sgn}(\alpha+1/2)\,\frac{2^{\alpha}}{\sqrt{\pi}}\,b_k\,z^{k}(1 - z^{-2})^{-\alpha-1/2} + O(k^{-1}|z|^{k}), \tag{7.4}$$

where

$$b_k = \frac{(k+\alpha+1/2)^{1/2}\,\Gamma(k+\alpha+1/2)}{\Gamma^{1/2}(k+1)\,\Gamma^{1/2}(k+2\alpha+1)} \to 1 \quad \text{as } k \to \infty.$$

From (7.3) we obtain for $w = e^{i\lambda}$

$$w^2 - 1 = \frac14 z^2(1 - z^{-2})^2,$$

which together with (7.4) yields

$$(1 - e^{2i\lambda})\,G_k^{(\alpha)}(e^{i\lambda}) = -\mathrm{sgn}(\alpha+1/2)\,\frac{2^{\alpha-2}}{\sqrt{\pi}}\,b_k\,z^{k+2}(1 - z^{-2})^{-\alpha+3/2} + O(k^{-1}|z|^{k}).$$

Since $|z| > 1$ and $z^2 - 1 = 2(e^{2i\lambda} - 1) + 2e^{3i\lambda/2}(e^{i\lambda} - e^{-i\lambda})^{1/2}$, we have

$$|1 - z^{-2}| \le |z^2 - 1| \le 2|e^{2i\lambda} - 1| + 2|e^{i\lambda} - e^{-i\lambda}|^{1/2} = 4|\sin\lambda| + 2\sqrt{2}\,|\sin\lambda|^{1/2} \le (4 + 2\sqrt{2})|\lambda|^{1/2}. \tag{7.5}$$

So that, by (7.4)-(7.5),

$$|(1 - e^{2i\lambda})\,G_k^{(\alpha)}(e^{i\lambda})| \le C_{14}\,b_k|z|^{k}|\lambda|^{-(2\alpha-3)/4}, \tag{7.6}$$

where $C_{14} = C_{14}(\alpha)$.

Finally, a straightforward verification shows that

$$\sup_{\lambda\in[-\pi,\pi]}\big|e^{i\lambda} + (e^{2i\lambda} - 1)^{1/2}\big| = 1 + \sqrt{2}.$$

This completes the proof of the lemma. □
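The value $1 + \sqrt{2}$ of the supremum used at the end of the proof of Lemma 7.1 can be confirmed numerically. The sketch below evaluates, on a grid of frequencies, the root $z$ of $w = (z + z^{-1})/2$ with $|z| \ge 1$ for $w = e^{i\lambda}$; since the two roots multiply to one, picking the larger modulus selects the branch used in the conformal mapping. The function name `outer_root` and the grid size are hypothetical.

```python
import cmath
import math

def outer_root(lam):
    """Root z of w = (z + 1/z)/2 with |z| >= 1, where w = e^{i lam}."""
    w = cmath.exp(1j * lam)
    s = cmath.sqrt(w * w - 1.0)
    z1, z2 = w + s, w - s          # z1 * z2 = 1, so one of them has modulus >= 1
    return z1 if abs(z1) >= abs(z2) else z2

grid = 4000                        # the grid contains lam = -pi/2 and lam = pi/2
sup_modulus = max(abs(outer_root(-math.pi + 2.0 * math.pi * k / grid))
                  for k in range(grid + 1))
```

The maximum is attained at $\lambda = \pm\pi/2$, where $w = \pm i$ and $z = \pm i(1 + \sqrt{2})$.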
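Proof of Lemma 4.2 Using (4.2), (4.6) rewrite the coefficients of $E_n$:

$$e_n(t-s) = (1-x^2)^{\alpha}\sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\sum_{j=0}^{k} g_{k,j}^{(\alpha)}\int_{-\pi}^{\pi} f(\lambda)\big(e^{i\lambda(t-s+j)} - e^{i\lambda(t-s+j+2)}\big)\,d\lambda.$$

Using the expression of the covariance function of an aggregated process, we have for $t - s + j \ge 0$

$$\int_{-\pi}^{\pi} f(\lambda)\big(e^{i\lambda(t-s+j)} - e^{i\lambda(t-s+j+2)}\big)\,d\lambda = \sigma(t-s+j) - \sigma(t-s+j+2) = \sigma_\varepsilon^2\int_{-1}^{1} y^{t-s+j}\varphi(y)\,dy.$$

Thus, assuming $\sigma_\varepsilon^2 = 1$, for $t - s \ge 0$ we have

$$e_n(t-s) = (1-x^2)^{\alpha}\sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\sum_{j=0}^{k} g_{k,j}^{(\alpha)}\int_{-1}^{1} y^{t-s+j}\varphi(y)\,dy$$
$$= (1-x^2)^{\alpha}\sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\int_{-1}^{1} y^{t-s}\varphi(y)\sum_{j=0}^{k} g_{k,j}^{(\alpha)}\,y^{j}\,dy$$
$$= (1-x^2)^{\alpha}\sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\int_{-1}^{1} y^{t-s}\varphi(y)\,G_k^{(\alpha)}(y)\,dy.$$

The integral $\int_{-1}^{1} y^{m}\varphi(y)\,G_k^{(\alpha)}(y)\,dy$ ($m$ is a nonnegative integer), appearing in the last expression, is nothing else but the $k$th coefficient, $\psi_{m,k}$, in the $\alpha$-Gegenbauer expansion of the function

$$\psi_m(x) = \frac{x^{m}\varphi(x)}{(1-x^2)^{\alpha}}, \tag{7.7}$$

which obviously satisfies $\psi_m \in L^2(w^{(\alpha)})$. Therefore,

$$e_n(t-s) = (1-x^2)^{\alpha}\sum_{k=0}^{K_n} G_k^{(\alpha)}(x)\,\psi_{|t-s|,k} = (1-x^2)^{\alpha}\Big(\psi_{|t-s|}(x) - \sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\,\psi_{|t-s|,k}\Big)$$

and, denoting $R_n(m) := \sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\,\psi_{|m|,k}$, $|m| < n$, we have

$$(1-x^2)^{-2\alpha}\|E_n\|^2 = \sum_{|m|<n}(n - |m|)\Big(\psi_{|m|}(x) - \sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\,\psi_{|m|,k}\Big)^2$$
$$= \sum_{|m|<n}(n - |m|)\,\psi_{|m|}^2(x) - 2\sum_{|m|<n}(n - |m|)\,\psi_{|m|}(x)\,R_n(m) + \sum_{|m|<n}(n - |m|)\,R_n^2(m) =: A_{1,n} - 2A_{2,n} + A_{3,n}.$$

Now, we prove that, as $n \to \infty$,

$$A_{1,n} \sim C_{15}\,n, \tag{7.8}$$

where $C_{15} = C_{15}(x) > 0$ is some positive constant, and

$$A_{2,n} = o(n). \tag{7.9}$$

Since the last term $A_{3,n}$ is nonnegative by construction, this will prove (4.9).
At points $x$ where $\varphi(x) \ne 0$ we have

$$A_{1,n} = \frac{\varphi^2(x)}{(1-x^2)^{2\alpha}}\sum_{|m|<n}(n - |m|)\,x^{2|m|} \sim n\,\frac{\varphi^2(x)(1+x^2)}{(1-x^2)^{2\alpha+1}}, \quad \text{as } n \to \infty,$$

which gives (7.8). Consider the term $A_{2,n}$. By (7.7),

$$A_{2,n} = \sum_{|m|<n}(n - |m|)\,\psi_{|m|}(x)\sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\,\psi_{|m|,k}$$
$$= \frac{\varphi(x)}{(1-x^2)^{\alpha}}\sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\int_{-1}^{1}\varphi(y)\,G_k^{(\alpha)}(y)\sum_{|m|<n}(n - |m|)(xy)^{|m|}\,dy$$
$$= \frac{\varphi(x)}{(1-x^2)^{\alpha}}\big(B_{1,n} - B_{2,n} - B_{3,n}\big),$$

where

$$B_{1,n} := n\sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\int_{-1}^{1}\varphi(y)\,G_k^{(\alpha)}(y)\sum_{m=-\infty}^{\infty}(xy)^{|m|}\,dy = n\sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\int_{-1}^{1}\varphi(y)\,G_k^{(\alpha)}(y)\,\frac{1+xy}{1-xy}\,dy,$$

$$B_{2,n} := \sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\int_{-1}^{1}\varphi(y)\,G_k^{(\alpha)}(y)\sum_{|m|<n}|m|\,(xy)^{|m|}\,dy,$$

$$B_{3,n} := n\sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\int_{-1}^{1}\varphi(y)\,G_k^{(\alpha)}(y)\sum_{|m|\ge n}(xy)^{|m|}\,dy.$$

Since, by (2.6),

$$\tilde\psi(y) = \frac{\varphi(y)}{(1-y^2)^{\alpha}}\cdot\frac{1+xy}{1-xy}$$

satisfies $\tilde\psi \in L^2(w^{(\alpha)})$ and $K_n \to \infty$, the sum $\sum_{k=K_n+1}^{\infty}$ in $B_{1,n}$ vanishes (as the tail of a convergent series). So that, $B_{1,n} = o(n)$ and, similarly, $B_{3,n} = o(n)$.
Finally,

$$B_{2,n} = \sum_{k=K_n+1}^{\infty} G_k^{(\alpha)}(x)\int_{-1}^{1}\varphi(y)\,G_k^{(\alpha)}(y)\Big(\sum_{|m|<n}|m|\,(xy)^{|m|}\Big)\,dy = O(1)$$

using a similar argument as in the case of the term $B_{1,n}$. This completes the proof of (7.9) and of the lemma. □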